Overview
Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 541909 |
| Missing cells | 1454 |
| Missing cells (%) | < 0.1% |
| Duplicate rows | 4879 |
| Duplicate rows (%) | 0.9% |
| Total size in memory | 185.5 MiB |
| Average record size in memory | 359.0 B |
Variable types
| Text | 3 |
|---|---|
| Numeric | 7 |
| DateTime | 1 |
| Categorical | 3 |
Alerts
| Dataset has 4879 (0.9%) duplicate rows | Duplicates |
|---|---|
| Month is highly overall correlated with Quarter | High correlation |
| Quantity is highly overall correlated with TotalVentas | High correlation |
| Quarter is highly overall correlated with Month | High correlation |
| TotalVentas is highly overall correlated with Quantity | High correlation |
| Country is highly imbalanced (85.9%) | Imbalance |
| Year is highly imbalanced (60.4%) | Imbalance |
| UnitPrice is highly skewed (γ1 = 186.5069717) | Skewed |
| DayOfWeek has 95111 (17.6%) zeros | Zeros |
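The imbalance percentages in the alerts appear consistent with one minus the normalised Shannon entropy of the class shares. The report itself does not state this formula, so it is an assumption here. The sketch below applies it to the Year counts listed later in this report (499428 rows for 2011, 42481 rows for 2010); the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no imbalance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Year counts as listed in this report: 2011 -> 499428 rows, 2010 -> 42481 rows.
year_counts = [499428, 42481]
year_score = imbalance_score(year_counts)
print(f"Year imbalance: {year_score:.1%}")
```

A score near 0 means the classes are evenly spread, and a score near 1 means a single class dominates.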
Reproduction
| Analysis started | 2025-03-21 11:49:12.044240 |
|---|---|
| Analysis finished | 2025-03-21 11:49:40.134233 |
| Duration | 28.09 seconds |
| Software version | ydata-profiling v4.15.0 |
| Download configuration | config.json |
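A report of this kind is typically produced with the `ProfileReport` class from `ydata_profiling`. The sketch below is a minimal example under stated assumptions: the input path, the output path, the report title, and the helper name `build_report` are all hypothetical, and the actual settings used for this run are in the downloadable `config.json`, not reproduced here.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report (paths are hypothetical)."""
    # Imports live inside the function so this module loads even where
    # pandas or ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumes the CSV already contains the columns shown in this report.
    df = pd.read_csv(csv_path, parse_dates=["InvoiceDate"])
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return profile
```

Calling `build_report("some_input.csv")` would write `report.html` next to the script, assuming both libraries are available.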
Variables
InvoiceNo
Text
| Distinct | 25900 |
|---|---|
| Distinct (%) | 4.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 32.6 MiB |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 6.0171449 |
| Min length | 6 |
Unique
| Unique | 5841 |
|---|---|
| Unique (%) | 1.1% |
Sample
| 1st row | 536365 |
|---|---|
| 2nd row | 536365 |
| 3rd row | 536365 |
| 4th row | 536365 |
| 5th row | 536365 |
| Value | Count | Frequency (%) |
| 573585 | 1114 | 0.2% |
| 581219 | 749 | 0.1% |
| 581492 | 731 | 0.1% |
| 580729 | 721 | 0.1% |
| 558475 | 705 | 0.1% |
| 579777 | 687 | 0.1% |
| 581217 | 676 | 0.1% |
| 537434 | 675 | 0.1% |
| 580730 | 662 | 0.1% |
| 538071 | 652 | 0.1% |
| Other values (25890) | 534537 | 98.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 5 | 866996 | 26.6% |
| 7 | 358618 | 11.0% |
| 6 | 339129 | 10.4% |
| 4 | 324436 | 9.9% |
| 8 | 248810 | 7.6% |
| 3 | 247661 | 7.6% |
| 0 | 224299 | 6.9% |
| 1 | 219402 | 6.7% |
| 9 | 214831 | 6.6% |
| 2 | 207272 | 6.4% |
| Other values (2) | 9291 | 0.3% |
StockCode
Text
| Distinct | 4070 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 32.1 MiB |
Length
| Max length | 12 |
|---|---|
| Median length | 5 |
| Mean length | 5.0868448 |
| Min length | 1 |
Unique
| Unique | 233 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 85123A |
|---|---|
| 2nd row | 71053 |
| 3rd row | 84406B |
| 4th row | 84029G |
| 5th row | 84029E |
| Value | Count | Frequency (%) |
| 85123a | 2380 | 0.4% |
| 22423 | 2203 | 0.4% |
| 85099b | 2159 | 0.4% |
| 47566 | 1727 | 0.3% |
| 20725 | 1639 | 0.3% |
| 84879 | 1502 | 0.3% |
| 22720 | 1477 | 0.3% |
| 22197 | 1476 | 0.3% |
| 21212 | 1385 | 0.3% |
| 20727 | 1350 | 0.2% |
| Other values (3949) | 524648 | 96.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 828325 | 30.0% |
| 1 | 296053 | 10.7% |
| 3 | 259035 | 9.4% |
| 8 | 210898 | 7.7% |
| 9 | 201222 | 7.3% |
| 0 | 197322 | 7.2% |
| 4 | 186057 | 6.7% |
| 7 | 180372 | 6.5% |
| 5 | 180005 | 6.5% |
| 6 | 155713 | 5.6% |
| Other values (41) | 61605 | 2.2% |
Description
Text
| Distinct | 4223 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 1454 |
| Missing (%) | 0.3% |
| Memory size | 43.2 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 28 |
| Mean length | 26.64378 |
| Min length | 1 |
Unique
| Unique | 308 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | WHITE HANGING HEART T-LIGHT HOLDER |
|---|---|
| 2nd row | WHITE METAL LANTERN |
| 3rd row | CREAM CUPID HEARTS COAT HANGER |
| 4th row | KNITTED UNION FLAG HOT WATER BOTTLE |
| 5th row | RED WOOLLY HOTTIE WHITE HEART. |
| Value | Count | Frequency (%) |
| set | 54599 | 2.3% |
| of | 53351 | 2.3% |
| bag | 51911 | 2.2% |
| red | 42902 | 1.8% |
| heart | 39163 | 1.7% |
| retrospot | 35126 | 1.5% |
| vintage | 33748 | 1.4% |
| design | 30066 | 1.3% |
| pink | 29526 | 1.2% |
| christmas | 25131 | 1.1% |
| Other values (2449) | 1973383 | 83.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1966406 | 13.7% |
| E | 1288969 | 9.0% |
| A | 1093609 | 7.6% |
| T | 956778 | 6.6% |
| R | 918258 | 6.4% |
| O | 864963 | 6.0% |
| I | 788099 | 5.5% |
| S | 777550 | 5.4% |
| N | 716689 | 5.0% |
| L | 705042 | 4.9% |
| Other values (67) | 4323401 | 30.0% |
Quantity
Real number (ℝ)
High correlation 
| Distinct | 722 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.5522495 |
| Minimum | -80995 |
|---|---|
| Maximum | 80995 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 10624 |
| Negative (%) | 2.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | -80995 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 3 |
| Q3 | 10 |
| 95-th percentile | 29 |
| Maximum | 80995 |
| Range | 161990 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 218.08116 |
|---|---|
| Coefficient of variation (CV) | 22.830346 |
| Kurtosis | 119769.16 |
| Mean | 9.5522495 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.26407631 |
| Sum | 5176450 |
| Variance | 47559.391 |
| Monotonicity | Not monotonic |
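Several of the Quantity statistics above are functions of the others, which gives a quick internal consistency check. The sketch below recomputes range, IQR, coefficient of variation, and variance from values copied out of the Quantity tables. It assumes the conventional definitions (CV as standard deviation over mean, variance as standard deviation squared), which the report does not spell out.

```python
# Values copied from the Quantity tables in this report.
minimum, maximum = -80995, 80995
q1, q3 = 1, 10
mean, std = 9.5522495, 218.08116

value_range = maximum - minimum   # reported Range: 161990
iqr = q3 - q1                     # reported IQR: 9
cv = std / mean                   # reported CV: 22.830346
variance = std ** 2               # reported Variance: 47559.391

print(value_range, iqr, round(cv, 3), round(variance, 1))
```

A CV above 20 together with an IQR of only 9 indicates that a handful of extreme values, such as the ±80995 entries, dominates the spread.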
| Value | Count | Frequency (%) |
| 1 | 148227 | 27.4% |
| 2 | 81829 | 15.1% |
| 12 | 61063 | 11.3% |
| 6 | 40868 | 7.5% |
| 4 | 38484 | 7.1% |
| 3 | 37121 | 6.9% |
| 24 | 24021 | 4.4% |
| 10 | 22288 | 4.1% |
| 8 | 13129 | 2.4% |
| 5 | 11757 | 2.2% |
| Other values (712) | 63122 | 11.6% |
| Value | Count | Frequency (%) |
| -80995 | 1 | < 0.1% |
| -74215 | 1 | < 0.1% |
| -9600 | 2 | < 0.1% |
| -9360 | 1 | < 0.1% |
| -9058 | 1 | < 0.1% |
| -5368 | 1 | < 0.1% |
| -4830 | 1 | < 0.1% |
| -3667 | 1 | < 0.1% |
| -3167 | 1 | < 0.1% |
| -3114 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 80995 | 1 | < 0.1% |
| 74215 | 1 | < 0.1% |
| 12540 | 1 | < 0.1% |
| 5568 | 1 | < 0.1% |
| 4800 | 1 | < 0.1% |
| 4300 | 1 | < 0.1% |
| 4000 | 1 | < 0.1% |
| 3906 | 1 | < 0.1% |
| 3186 | 1 | < 0.1% |
| 3114 | 2 | < 0.1% |
InvoiceDate
Date
| Distinct | 23260 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| Minimum | 2010-12-01 08:26:00 |
|---|---|
| Maximum | 2011-12-09 12:50:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
UnitPrice
Real number (ℝ)
Skewed 
| Distinct | 1630 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.6111136 |
| Minimum | -11062.06 |
|---|---|
| Maximum | 38970 |
| Zeros | 2515 |
| Zeros (%) | 0.5% |
| Negative | 2 |
| Negative (%) | < 0.1% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | -11062.06 |
|---|---|
| 5-th percentile | 0.42 |
| Q1 | 1.25 |
| median | 2.08 |
| Q3 | 4.13 |
| 95-th percentile | 9.95 |
| Maximum | 38970 |
| Range | 50032.06 |
| Interquartile range (IQR) | 2.88 |
Descriptive statistics
| Standard deviation | 96.759853 |
|---|---|
| Coefficient of variation (CV) | 20.984053 |
| Kurtosis | 59005.719 |
| Mean | 4.6111136 |
| Median Absolute Deviation (MAD) | 1.23 |
| Skewness | 186.50697 |
| Sum | 2498804 |
| Variance | 9362.4692 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.25 | 50496 | 9.3% |
| 1.65 | 38181 | 7.0% |
| 0.85 | 28497 | 5.3% |
| 2.95 | 27768 | 5.1% |
| 0.42 | 24533 | 4.5% |
| 4.95 | 19040 | 3.5% |
| 3.75 | 18600 | 3.4% |
| 2.1 | 17697 | 3.3% |
| 2.46 | 17091 | 3.2% |
| 2.08 | 17005 | 3.1% |
| Other values (1620) | 283001 | 52.2% |
| Value | Count | Frequency (%) |
| -11062.06 | 2 | < 0.1% |
| 0 | 2515 | 0.5% |
| 0.001 | 4 | < 0.1% |
| 0.01 | 1 | < 0.1% |
| 0.03 | 3 | < 0.1% |
| 0.04 | 66 | < 0.1% |
| 0.06 | 117 | < 0.1% |
| 0.07 | 9 | < 0.1% |
| 0.08 | 56 | < 0.1% |
| 0.09 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 38970 | 1 | < 0.1% |
| 17836.46 | 1 | < 0.1% |
| 16888.02 | 1 | < 0.1% |
| 16453.71 | 1 | < 0.1% |
| 13541.33 | 3 | < 0.1% |
| 13474.79 | 1 | < 0.1% |
| 11586.5 | 1 | < 0.1% |
| 11062.06 | 1 | < 0.1% |
| 8286.22 | 1 | < 0.1% |
| 8142.75 | 2 | < 0.1% |
CustomerID
Real number (ℝ)
| Distinct | 4372 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15253.867 |
| Minimum | 12346 |
|---|---|
| Maximum | 18287 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 12346 |
|---|---|
| 5-th percentile | 12709 |
| Q1 | 14367 |
| median | 15152 |
| Q3 | 16255 |
| 95-th percentile | 17841 |
| Maximum | 18287 |
| Range | 5941 |
| Interquartile range (IQR) | 1888 |
Descriptive statistics
| Standard deviation | 1485.9059 |
|---|---|
| Coefficient of variation (CV) | 0.097411746 |
| Kurtosis | -0.57699687 |
| Mean | 15253.867 |
| Median Absolute Deviation (MAD) | 941 |
| Skewness | 0.10246327 |
| Sum | 8.266208 × 10^9 |
| Variance | 2207916.2 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 15152 | 135358 | 25.0% |
| 17841 | 7983 | 1.5% |
| 14911 | 5903 | 1.1% |
| 14096 | 5128 | 0.9% |
| 12748 | 4642 | 0.9% |
| 14606 | 2782 | 0.5% |
| 15311 | 2491 | 0.5% |
| 14646 | 2085 | 0.4% |
| 13089 | 1857 | 0.3% |
| 13263 | 1677 | 0.3% |
| Other values (4362) | 372003 | 68.6% |
| Value | Count | Frequency (%) |
| 12346 | 2 | < 0.1% |
| 12347 | 182 | < 0.1% |
| 12348 | 31 | < 0.1% |
| 12349 | 73 | < 0.1% |
| 12350 | 17 | < 0.1% |
| 12352 | 95 | < 0.1% |
| 12353 | 4 | < 0.1% |
| 12354 | 58 | < 0.1% |
| 12355 | 13 | < 0.1% |
| 12356 | 59 | < 0.1% |
| Value | Count | Frequency (%) |
| 18287 | 70 | < 0.1% |
| 18283 | 756 | 0.1% |
| 18282 | 13 | < 0.1% |
| 18281 | 7 | < 0.1% |
| 18280 | 10 | < 0.1% |
| 18278 | 9 | < 0.1% |
| 18277 | 9 | < 0.1% |
| 18276 | 16 | < 0.1% |
| 18274 | 22 | < 0.1% |
| 18273 | 3 | < 0.1% |
Country
Categorical
Imbalance 
| Distinct | 38 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 36.4 MiB |
| United Kingdom | 495478 |
|---|---|
| Germany | 9495 |
| France | 8557 |
| EIRE | 8196 |
| Spain | 2533 |
| Other values (33) | 17650 |
Length
| Max length | 20 |
|---|---|
| Median length | 14 |
| Mean length | 13.376203 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | United Kingdom |
|---|---|
| 2nd row | United Kingdom |
| 3rd row | United Kingdom |
| 4th row | United Kingdom |
| 5th row | United Kingdom |
Common Values
| Value | Count | Frequency (%) |
| United Kingdom | 495478 | 91.4% |
| Germany | 9495 | 1.8% |
| France | 8557 | 1.6% |
| EIRE | 8196 | 1.5% |
| Spain | 2533 | 0.5% |
| Netherlands | 2371 | 0.4% |
| Belgium | 2069 | 0.4% |
| Switzerland | 2002 | 0.4% |
| Portugal | 1519 | 0.3% |
| Australia | 1259 | 0.2% |
| Other values (28) | 8430 | 1.6% |
Length
| Value | Count | Frequency (%) |
| united | 495546 | 47.7% |
| kingdom | 495478 | 47.7% |
| germany | 9495 | 0.9% |
| france | 8557 | 0.8% |
| eire | 8196 | 0.8% |
| spain | 2533 | 0.2% |
| netherlands | 2371 | 0.2% |
| belgium | 2069 | 0.2% |
| switzerland | 2002 | 0.2% |
| portugal | 1519 | 0.1% |
| Other values (35) | 10904 | 1.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1023046 | 14.1% |
| i | 1001404 | 13.8% |
| d | 998442 | 13.8% |
| e | 526754 | 7.3% |
| m | 507621 | 7.0% |
| t | 504192 | 7.0% |
| g | 499871 | 6.9% |
| o | 499396 | 6.9% |
| (space) | 496761 | 6.9% |
| U | 496283 | 6.8% |
| Other values (31) | 694915 | 9.6% |
TotalVentas
Real number (ℝ)
High correlation 
| Distinct | 6204 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.987795 |
| Minimum | -168469.6 |
|---|---|
| Maximum | 168469.6 |
| Zeros | 2515 |
| Zeros (%) | 0.5% |
| Negative | 9290 |
| Negative (%) | 1.7% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | -168469.6 |
|---|---|
| 5-th percentile | 0.83 |
| Q1 | 3.4 |
| median | 9.75 |
| Q3 | 17.4 |
| 95-th percentile | 59.4 |
| Maximum | 168469.6 |
| Range | 336939.2 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 378.81082 |
|---|---|
| Coefficient of variation (CV) | 21.059325 |
| Kurtosis | 151198 |
| Mean | 17.987795 |
| Median Absolute Deviation (MAD) | 6.75 |
| Skewness | -0.96438918 |
| Sum | 9747747.9 |
| Variance | 143497.64 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 15 | 20267 | 3.7% |
| 1.25 | 9550 | 1.8% |
| 2.46 | 9275 | 1.7% |
| 17.7 | 9250 | 1.7% |
| 4.13 | 8811 | 1.6% |
| 16.5 | 8533 | 1.6% |
| 10.2 | 8099 | 1.5% |
| 19.8 | 7676 | 1.4% |
| 3.75 | 7455 | 1.4% |
| 3.29 | 6522 | 1.2% |
| Other values (6194) | 446471 | 82.4% |
| Value | Count | Frequency (%) |
| -168469.6 | 1 | < 0.1% |
| -77183.6 | 1 | < 0.1% |
| -38970 | 1 | < 0.1% |
| -17836.46 | 1 | < 0.1% |
| -16888.02 | 1 | < 0.1% |
| -16453.71 | 1 | < 0.1% |
| -13541.33 | 2 | < 0.1% |
| -13474.79 | 1 | < 0.1% |
| -11586.5 | 1 | < 0.1% |
| -11062.06 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 168469.6 | 1 | < 0.1% |
| 77183.6 | 1 | < 0.1% |
| 38970 | 1 | < 0.1% |
| 13541.33 | 1 | < 0.1% |
| 11062.06 | 1 | < 0.1% |
| 8142.75 | 1 | < 0.1% |
| 7144.72 | 1 | < 0.1% |
| 6539.4 | 2 | < 0.1% |
| 4992 | 1 | < 0.1% |
| 4921.5 | 1 | < 0.1% |
Year
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 32.6 MiB |
| 2011.0 | 499428 |
|---|---|
| 2010.0 | 42481 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2010.0 |
|---|---|
| 2nd row | 2010.0 |
| 3rd row | 2010.0 |
| 4th row | 2010.0 |
| 5th row | 2010.0 |
Common Values
| Value | Count | Frequency (%) |
| 2011.0 | 499428 | 92.2% |
| 2010.0 | 42481 | 7.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1126299 | 34.6% |
| 1 | 1041337 | 32.0% |
| 2 | 541909 | 16.7% |
| . | 541909 | 16.7% |
Month
Real number (ℝ)
High correlation 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.5531279 |
| Minimum | 1 |
|---|---|
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 8 |
| Q3 | 11 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.5090554 |
|---|---|
| Coefficient of variation (CV) | 0.46458307 |
| Kurtosis | -1.1200445 |
| Mean | 7.5531279 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.41481291 |
| Sum | 4093108 |
| Variance | 12.31347 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11 | 84711 | 15.6% |
| 12 | 68006 | 12.5% |
| 10 | 60742 | 11.2% |
| 9 | 50226 | 9.3% |
| 7 | 39518 | 7.3% |
| 5 | 37030 | 6.8% |
| 6 | 36874 | 6.8% |
| 3 | 36748 | 6.8% |
| 8 | 35284 | 6.5% |
| 1 | 35147 | 6.5% |
| Other values (2) | 57623 | 10.6% |
| Value | Count | Frequency (%) |
| 1 | 35147 | 6.5% |
| 2 | 27707 | 5.1% |
| 3 | 36748 | 6.8% |
| 4 | 29916 | 5.5% |
| 5 | 37030 | 6.8% |
| 6 | 36874 | 6.8% |
| 7 | 39518 | 7.3% |
| 8 | 35284 | 6.5% |
| 9 | 50226 | 9.3% |
| 10 | 60742 | 11.2% |
| Value | Count | Frequency (%) |
| 12 | 68006 | 12.5% |
| 11 | 84711 | 15.6% |
| 10 | 60742 | 11.2% |
| 9 | 50226 | 9.3% |
| 8 | 35284 | 6.5% |
| 7 | 39518 | 7.3% |
| 6 | 36874 | 6.8% |
| 5 | 37030 | 6.8% |
| 4 | 29916 | 5.5% |
| 3 | 36748 | 6.8% |
Quarter
Categorical
High correlation 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 31.0 MiB |
| 4.0 | 213459 |
|---|---|
| 3.0 | 125028 |
| 2.0 | 103820 |
| 1.0 | 99602 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 4.0 |
|---|---|
| 2nd row | 4.0 |
| 3rd row | 4.0 |
| 4th row | 4.0 |
| 5th row | 4.0 |
Common Values
| Value | Count | Frequency (%) |
| 4.0 | 213459 | 39.4% |
| 3.0 | 125028 | 23.1% |
| 2.0 | 103820 | 19.2% |
| 1.0 | 99602 | 18.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 541909 | 33.3% |
| 0 | 541909 | 33.3% |
| 4 | 213459 | 13.1% |
| 3 | 125028 | 7.7% |
| 2 | 103820 | 6.4% |
| 1 | 99602 | 6.1% |
DayOfWeek
Real number (ℝ)
Zeros 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.4312772 |
| Minimum | 0 |
|---|---|
| Maximum | 6 |
| Zeros | 95111 |
| Zeros (%) | 17.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.8447087 |
|---|---|
| Coefficient of variation (CV) | 0.75874058 |
| Kurtosis | -0.6568368 |
| Mean | 2.4312772 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.46719471 |
| Sum | 1317531 |
| Variance | 3.4029502 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 103857 | 19.2% |
| 1 | 101808 | 18.8% |
| 0 | 95111 | 17.6% |
| 2 | 94565 | 17.5% |
| 4 | 82193 | 15.2% |
| 6 | 64375 | 11.9% |
| Value | Count | Frequency (%) |
| 0 | 95111 | 17.6% |
| 1 | 101808 | 18.8% |
| 2 | 94565 | 17.5% |
| 3 | 103857 | 19.2% |
| 4 | 82193 | 15.2% |
| 6 | 64375 | 11.9% |
| Value | Count | Frequency (%) |
| 6 | 64375 | 11.9% |
| 4 | 82193 | 15.2% |
| 3 | 103857 | 19.2% |
| 2 | 94565 | 17.5% |
| 1 | 101808 | 18.8% |
| 0 | 95111 | 17.6% |
Hour
Real number (ℝ)
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.078729 |
| Minimum | 6 |
|---|---|
| Maximum | 20 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 11 |
| median | 13 |
| Q3 | 15 |
| 95-th percentile | 17 |
| Maximum | 20 |
| Range | 14 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.4432701 |
|---|---|
| Coefficient of variation (CV) | 0.1868125 |
| Kurtosis | -0.68580894 |
| Mean | 13.078729 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.0055453915 |
| Sum | 7087481 |
| Variance | 5.9695688 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 78709 | 14.5% |
| 15 | 77519 | 14.3% |
| 13 | 72259 | 13.3% |
| 14 | 67471 | 12.5% |
| 11 | 57674 | 10.6% |
| 16 | 54516 | 10.1% |
| 10 | 49037 | 9.0% |
| 9 | 34332 | 6.3% |
| 17 | 28509 | 5.3% |
| 8 | 8909 | 1.6% |
| Other values (5) | 12974 | 2.4% |
| Value | Count | Frequency (%) |
| 6 | 41 | < 0.1% |
| 7 | 383 | 0.1% |
| 8 | 8909 | 1.6% |
| 9 | 34332 | 6.3% |
| 10 | 49037 | 9.0% |
| 11 | 57674 | 10.6% |
| 12 | 78709 | 14.5% |
| 13 | 72259 | 13.3% |
| 14 | 67471 | 12.5% |
| 15 | 77519 | 14.3% |
| Value | Count | Frequency (%) |
| 20 | 871 | 0.2% |
| 19 | 3705 | 0.7% |
| 18 | 7974 | 1.5% |
| 17 | 28509 | 5.3% |
| 16 | 54516 | 10.1% |
| 15 | 77519 | 14.3% |
| 14 | 67471 | 12.5% |
| 13 | 72259 | 13.3% |
| 12 | 78709 | 14.5% |
| 11 | 57674 | 10.6% |
Interactions
Correlations
| | Country | CustomerID | DayOfWeek | Hour | Month | Quantity | Quarter | TotalVentas | UnitPrice | Year |
|---|---|---|---|---|---|---|---|---|---|---|
| Country | 1.000 | 0.287 | 0.057 | 0.081 | 0.058 | 0.042 | 0.057 | 0.030 | 0.006 | 0.051 |
| CustomerID | 0.287 | 1.000 | 0.017 | 0.044 | 0.028 | -0.109 | 0.043 | -0.131 | -0.014 | 0.078 |
| DayOfWeek | 0.057 | 0.017 | 1.000 | -0.042 | 0.036 | 0.017 | 0.041 | -0.015 | -0.035 | 0.038 |
| Hour | 0.081 | 0.044 | -0.042 | 1.000 | 0.027 | -0.210 | 0.063 | -0.199 | 0.026 | 0.059 |
| Month | 0.058 | 0.028 | 0.036 | 0.027 | 1.000 | -0.025 | 1.000 | -0.031 | -0.003 | 0.466 |
| Quantity | 0.042 | -0.109 | 0.017 | -0.210 | -0.025 | 1.000 | 0.008 | 0.692 | -0.385 | 0.000 |
| Quarter | 0.057 | 0.043 | 0.041 | 0.063 | 1.000 | 0.008 | 1.000 | 0.005 | 0.002 | 0.362 |
| TotalVentas | 0.030 | -0.131 | -0.015 | -0.199 | -0.031 | 0.692 | 0.005 | 1.000 | 0.327 | 0.000 |
| UnitPrice | 0.006 | -0.014 | -0.035 | 0.026 | -0.003 | -0.385 | 0.002 | 0.327 | 1.000 | 0.007 |
| Year | 0.051 | 0.078 | 0.038 | 0.059 | 0.466 | 0.000 | 0.362 | 0.000 | 0.007 | 1.000 |
Missing values
Sample
| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country | TotalVentas | Year | Month | Quarter | DayOfWeek | Hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 536365 | 85123A | WHITE HANGING HEART T-LIGHT HOLDER | 6.0 | 2010-12-01 08:26:00 | 2.55 | 17850.0 | United Kingdom | 15.30 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 1 | 536365 | 71053 | WHITE METAL LANTERN | 6.0 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom | 20.34 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 2 | 536365 | 84406B | CREAM CUPID HEARTS COAT HANGER | 8.0 | 2010-12-01 08:26:00 | 2.75 | 17850.0 | United Kingdom | 22.00 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 3 | 536365 | 84029G | KNITTED UNION FLAG HOT WATER BOTTLE | 6.0 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom | 20.34 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 4 | 536365 | 84029E | RED WOOLLY HOTTIE WHITE HEART. | 6.0 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom | 20.34 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 5 | 536365 | 22752 | SET 7 BABUSHKA NESTING BOXES | 2.0 | 2010-12-01 08:26:00 | 7.65 | 17850.0 | United Kingdom | 15.30 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 6 | 536365 | 21730 | GLASS STAR FROSTED T-LIGHT HOLDER | 6.0 | 2010-12-01 08:26:00 | 4.25 | 17850.0 | United Kingdom | 25.50 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 7 | 536366 | 22633 | HAND WARMER UNION JACK | 6.0 | 2010-12-01 08:28:00 | 1.85 | 17850.0 | United Kingdom | 11.10 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 8 | 536366 | 22632 | HAND WARMER RED POLKA DOT | 6.0 | 2010-12-01 08:28:00 | 1.85 | 17850.0 | United Kingdom | 11.10 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| 9 | 536367 | 84879 | ASSORTED COLOUR BIRD ORNAMENT | 32.0 | 2010-12-01 08:34:00 | 1.69 | 13047.0 | United Kingdom | 54.08 | 2010.0 | 12.0 | 4.0 | 2.0 | 8.0 |
| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country | TotalVentas | Year | Month | Quarter | DayOfWeek | Hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 541899 | 581587 | 22726 | ALARM CLOCK BAKELIKE GREEN | 4.0 | 2011-12-09 12:50:00 | 3.75 | 12680.0 | France | 15.00 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541900 | 581587 | 22730 | ALARM CLOCK BAKELIKE IVORY | 4.0 | 2011-12-09 12:50:00 | 3.75 | 12680.0 | France | 15.00 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541901 | 581587 | 22367 | CHILDRENS APRON SPACEBOY DESIGN | 8.0 | 2011-12-09 12:50:00 | 1.95 | 12680.0 | France | 15.60 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541902 | 581587 | 22629 | SPACEBOY LUNCH BOX | 12.0 | 2011-12-09 12:50:00 | 1.95 | 12680.0 | France | 23.40 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541903 | 581587 | 23256 | CHILDRENS CUTLERY SPACEBOY | 4.0 | 2011-12-09 12:50:00 | 4.15 | 12680.0 | France | 16.60 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541904 | 581587 | 22613 | PACK OF 20 SPACEBOY NAPKINS | 12.0 | 2011-12-09 12:50:00 | 0.85 | 12680.0 | France | 10.20 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541905 | 581587 | 22899 | CHILDREN'S APRON DOLLY GIRL | 6.0 | 2011-12-09 12:50:00 | 2.10 | 12680.0 | France | 12.60 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541906 | 581587 | 23254 | CHILDRENS CUTLERY DOLLY GIRL | 4.0 | 2011-12-09 12:50:00 | 4.15 | 12680.0 | France | 16.60 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541907 | 581587 | 23255 | CHILDRENS CUTLERY CIRCUS PARADE | 4.0 | 2011-12-09 12:50:00 | 4.15 | 12680.0 | France | 16.60 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
| 541908 | 581587 | 22138 | BAKING SET 9 PIECE RETROSPOT | 3.0 | 2011-12-09 12:50:00 | 4.95 | 12680.0 | France | 14.85 | 2011.0 | 12.0 | 4.0 | 4.0 | 12.0 |
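The sample rows suggest that TotalVentas equals Quantity × UnitPrice and that Year, Month, Quarter, DayOfWeek, and Hour are derived from InvoiceDate, with Monday coded as 0. The report does not document how these columns were built, so the sketch below is an assumption that simply reproduces the values visible in the first and last sample rows. The helper name `derive_columns` is hypothetical.

```python
from datetime import datetime

def derive_columns(quantity, unit_price, invoice_date):
    """Rebuild the derived columns for one row (assumed definitions)."""
    ts = datetime.strptime(invoice_date, "%Y-%m-%d %H:%M:%S")
    return {
        "TotalVentas": round(quantity * unit_price, 2),
        "Year": ts.year,
        "Month": ts.month,
        "Quarter": (ts.month - 1) // 3 + 1,
        "DayOfWeek": ts.weekday(),  # Monday = 0, Sunday = 6
        "Hour": ts.hour,
    }

# First and last rows of the sample tables in this report.
first = derive_columns(6, 2.55, "2010-12-01 08:26:00")
last = derive_columns(3, 4.95, "2011-12-09 12:50:00")
print(first)
print(last)
```

Under this coding, the Zeros alert on DayOfWeek reflects Monday transactions rather than missing data, and a perfect Month-Quarter association is expected because Quarter is a function of Month.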
Duplicate rows
Most frequently occurring
| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country | TotalVentas | Year | Month | Quarter | DayOfWeek | Hour | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1613 | 555524 | 22698 | PINK REGENCY TEACUP AND SAUCER | 1.0 | 2011-06-05 11:37:00 | 2.95 | 16923.0 | United Kingdom | 2.95 | 2011.0 | 6.0 | 2.0 | 6.0 | 11.0 | 20 |
| 1612 | 555524 | 22697 | GREEN REGENCY TEACUP AND SAUCER | 1.0 | 2011-06-05 11:37:00 | 2.95 | 16923.0 | United Kingdom | 2.95 | 2011.0 | 6.0 | 2.0 | 6.0 | 11.0 | 12 |
| 3202 | 572861 | 22775 | PURPLE DRAWERKNOB ACRYLIC EDWARDIAN | 12.0 | 2011-10-26 12:46:00 | 1.25 | 14102.0 | United Kingdom | 15.00 | 2011.0 | 10.0 | 4.0 | 2.0 | 12.0 | 8 |
| 347 | 538514 | 21756 | BATH BUILDING BLOCK WORD | 1.0 | 2010-12-12 14:27:00 | 5.95 | 15044.0 | United Kingdom | 5.95 | 2010.0 | 12.0 | 4.0 | 6.0 | 14.0 | 6 |
| 474 | 540524 | 21756 | BATH BUILDING BLOCK WORD | 1.0 | 2011-01-09 12:53:00 | 5.95 | 16735.0 | United Kingdom | 5.95 | 2011.0 | 1.0 | 1.0 | 6.0 | 12.0 | 6 |
| 528 | 541266 | 21754 | HOME BUILDING BLOCK WORD | 1.0 | 2011-01-16 16:25:00 | 5.95 | 15673.0 | United Kingdom | 5.95 | 2011.0 | 1.0 | 1.0 | 6.0 | 16.0 | 6 |
| 529 | 541266 | 21755 | LOVE BUILDING BLOCK WORD | 1.0 | 2011-01-16 16:25:00 | 5.95 | 15673.0 | United Kingdom | 5.95 | 2011.0 | 1.0 | 1.0 | 6.0 | 16.0 | 6 |
| 3146 | 572344 | M | Manual | 48.0 | 2011-10-24 10:43:00 | 1.50 | 14607.0 | United Kingdom | 72.00 | 2011.0 | 10.0 | 4.0 | 0.0 | 10.0 | 6 |
| 4249 | 578289 | 23395 | BELLE JARDINIERE CUSHION COVER | 1.0 | 2011-11-23 14:07:00 | 3.75 | 17841.0 | United Kingdom | 3.75 | 2011.0 | 11.0 | 4.0 | 2.0 | 14.0 | 6 |
| 186 | 537224 | 70007 | HI TEC ALPINE HAND WARMER | 1.0 | 2010-12-05 16:24:00 | 1.65 | 13174.0 | United Kingdom | 1.65 | 2010.0 | 12.0 | 4.0 | 6.0 | 16.0 | 5 |
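A table like the one above can be approximated by grouping identical rows and counting how often each occurs. The sketch below is a minimal illustration on made-up toy rows, not on the report's data. The helper name `duplicate_groups` is hypothetical, and the exact counting convention the profiler uses for its "Duplicate rows" total is not stated in the report.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return (row, occurrences) pairs for rows seen more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda pair: pair[1], reverse=True)

# Toy rows (InvoiceNo, StockCode, Quantity), made up for illustration only.
toy_rows = [
    ("A1", "X", 1),
    ("A1", "X", 1),
    ("A1", "X", 1),
    ("A1", "Y", 2),
    ("A2", "Z", 5),
    ("A2", "Z", 5),
]
groups = duplicate_groups(toy_rows)
print(groups)
```

Whether such repeats are genuine errors or legitimate repeated line items on one invoice depends on the source system, so dropping them is a judgment call.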